An integer linear programming approach for approximate string comparison

Authors

  • Marcus Ritt
  • Alysson M. Costa
  • Sérgio Luis Sardi Mergen
  • Viviane Pereira Moreira
Abstract

We introduce a problem called Maximum Common Characters in Blocks (MCCB), which arises in applications of approximate string comparison, particularly in the unification of possibly erroneous textual data coming from different sources. We show that this problem is NP-complete, but can nevertheless be solved satisfactorily using integer linear programming for instances of practical interest. Two integer linear formulations are proposed and compared in terms of their linear relaxations. We also compare the results of the approximate matching with other known measures such as the Levenshtein (edit) distance.
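For readers unfamiliar with the baseline measure named in the abstract, the Levenshtein (edit) distance can be sketched with the standard dynamic program below. This is only an illustration of that well-known measure, assuming unit costs for insertion, deletion, and substitution. It is not the MCCB formulation from the paper, which the abstract does not spell out, and the function name `levenshtein` is chosen here for illustration.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between a and b with unit-cost insert, delete, substitute."""
    # prev[j] is the distance between the previous prefix of a and b[:j];
    # for the empty prefix of a it is simply j insertions.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        # Turning a[:i] into the empty string takes i deletions.
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # match or substitute
            ))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    print(levenshtein("kitten", "sitting"))
```

Keeping only two rows of the table holds memory at O(len(b)) while the running time stays O(len(a) · len(b)).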


Similar resources

A Non-radial Approach for Setting Integer-valued Targets in Data Envelopment Analysis

Data Envelopment Analysis (DEA) has been widely studied in the literature since its inception with Charnes, Cooper and Rhodes' work in 1978. The methodology behind the classical DEA method is to determine how much improvement in the output (input) dimensions is necessary in order to render them efficient. One of the underlying assumptions of this methodology is that the units consume and prod...


A Non-linear Integer Bi-level Programming Model for Competitive Facility Location of Distribution Centers

Facility location is a strategic decision for a supply chain, one that determines the profitability and sustainability of its components. This paper deals with a scenario where two supply chains, consisting of a producer, a number of distribution centers and several retailers provided with similar products, compete to maintain their market shares by opening new distribution cent...


An L1-norm method for generating all of efficient solutions of multi-objective integer linear programming problem

This paper extends the method proposed by Jahanshahloo et al. (2004) (a method for generating all the efficient solutions of a 0–1 multi-objective linear programming problem, Asia-Pacific Journal of Operational Research). This paper considers the recession direction for a multi-objective integer linear programming (MOILP) problem and presents necessary and sufficient conditions to have unbounde...


RESOLUTION METHOD FOR MIXED INTEGER LINEAR MULTIPLICATIVE-LINEAR BILEVEL PROBLEMS BASED ON DECOMPOSITION TECHNIQUE

In this paper, we propose an algorithm based on a decomposition technique for solving mixed integer linear multiplicative-linear bilevel problems. This algorithm is an application of the algorithm given by G. K. Saharidis et al. to the case where the first-level objective function is linear multiplicative. We use quasi-concavity properties of bilevel programming problems and decompose th...


Parameterized matching on non-linear structures

The classical pattern matching paradigm is that of seeking occurrences of one string in another, where both strings are drawn from an alphabet set Σ. In the parameterized pattern matching model, a consistent renaming of symbols from Σ is allowed in a match. The parameterized matching paradigm has proven useful in problems in software engineering, computer vision, and other applications. In clas...



Journal:
  • European Journal of Operational Research

Volume 198

Publication date: 2009